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Abstract 

Mitochondria play a fundamental role in cellular metabolism, being responsible for most of the energy production of the cell in the 
oxidative phosphorylation (OXPHOS) pathway. Mitochondrial DNA (mtDNA) encodes for key components of this process, but its 
direct role in adaptation remains far from understood. Hares (Lepusspp.) are privileged models to study the impact of natural selection 
on mitogenomic evolution because 1) species are adapted to contrasting environments, including arctic, with different metabolic 
pressures, and 2) mtDNA introgression from arctic into temperate species is widespread. Here, we analyzed the sequences of 1 1 
complete mitogenomes (ten newly obtained) of hares of temperate and arctic origins (including two of arctic origin introgressed into 
temperate species). The analysis of patterns of codon substitutions along the reconstructed phylogeny showed evidence for positive 
selection in several codons in genes of the OXPHOS complexes, most notably affecting the arctic lineage. However, using theoretical 
models, no predictable effect of these differences was found on the structure and physicochemical properties of the encoded 
proteins, suggesting that the focus of selection may lie on complex interactions with nuclear encoded peptides. Also, a cloverleaf 
structure was detected in the control region only from the arctic mtDNA lineage, which may influence mtDNA replication and 
transcription. These results suggest that adaptation impacted the evolution of hare mtDNA and may have influenced the occurrence 
and consequences of the many reported cases of massive mtDNA introgression. However, the origin of adaptation remains elusive. 

Key words: mitogenomics, Lepus, codon evolution, d^ds, protein structure, positive selection. 



Introduction 

Mitochondria are essential components of the cell as they are 
responsible for the production of most of cellular energy 
through the oxidative phosphorylation (OXPHOS) pathway, 
which takes place in protein complexes embedded in the 
inner mitochondrial membrane (Saraste 1999). The energy 
generated along the electron gradient of the OXPHOS chain 
may be converted in usable energy to the cell, in the form of 



ATP, or released as heat by the uncoupling of the OXPHOS, 
which is, for example, important for the thermogenesis of 
organisms that inhabit cold environments (Cannon and 
Nedergaard 2004). 

The small circular double-stranded mitochondrial DNA 
(mtDNA) molecule is present in numerous copies in each mi- 
tochondrion and encodes 13 of the dozens of proteins that 
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constitute the OXPHOS complexes. Even if these proteins are 
involved in key processes of the cell, the direct impact of 
mtDNA variation on individual fitness, and thus potentially 
on adaptive evolution, has traditionally been a matter of 
debate. The evolution of mtDNA has long been considered 
neutral, mostly for the convenience of phylogenetic and phy- 
logeographic inferences, but evidence that it is subject to 
various sorts of selective pressures accumulate in the literature. 
For example, some deleterious effects of mtDNA variation 
underlie diseases (e.g., Taylor and Turnbull 2005; Tuppen 
et al. 2010) or low male fertility (e.g.. Smith et al. 2010), as 
expected given its maternal transmission. Also, mtDNA has 
been suggested to be involved in adaptation to high altitude 
(Luo et al. 2008; Hassanin et al. 2009; Scott et al. 201 1) or 
temperature (Ballard and Whitlock 2004; Ballard and Rand 
2005; Fontanillas et al. 2005), or to coevolve with nuclear 
genes due to the functional interdependence of mitochondrial 
and nuclear genomes (Rand et al. 2004; Willett and Burton 
2004; Blier et al. 2006; Dowling et al. 2008). Analyses of 
codon sequence variation have often revealed a role for pos- 
itive selection in the evolution of mtDNA, as, for example, in 
salmons (Garvin et al. 2011), whales (Foote et al. 2011), or 
along the phylogeny of mammals (da Fonseca et al. 2008), 
among many others. Also, an exhaustive meta-analysis of in- 
traspecific mtDNA sequence variation has shown that mtDNA 
diversity cannot be predicted by effective population sizes, a 
genetic pattern that is compatible with the occurrence of 
recurrent selective sweeps affecting mtDNA (Bazin et al. 
2006). This may question the use of mtDNA for inferences 
of species historical demography but reveals the great poten- 
tial of studies that attempt to assess the adaptive impact of 
mtDNA evolution (Galtier et al. 2009). 

The interest of studying the adaptive relevance of mtDNA 
evolution is reinforced by relatively frequent observations of 
interspecific mtDNA introgression (see e.g., Toews and 
Brelsford 2012), which are often not paralleled by equivalent 
nuclear introgression (e.g., Roca et al. 2005; Melo-Ferreira 
et al. 2012). Such situations offer natural laboratories to 
study the effect of mtDNA replacement on the evolutionary 
trajectory of populations (Ballard and Whitlock 2004). The 
adaptive nature of mtDNA introgression has been suggested 
in numerous such situations, as, for example, in fish (Nevado 
etal. 2009; Mee and Taylor 2012), amphibians (Sequeiraetal. 
2005; Hofman et al. 2007), birds (Irwin et al. 2009; Brelsford 
et al. 201 1 ), or mammals (Melo-Ferreira et al. 201 1 ; Reid et al. 
2012), among other animal groups. However, disentangling 
the relative contribution of neutral demography and natural 
selection to mtDNA introgression remains a major challenge 
when based on the sole patterns of polymorphism of mtDNA 
sequence in populations (see e.g., Melo-Ferreira et al. 201 1). 

Hares {Lepus spp.) are a particularly suitable group of 
species to study the role of natural selection in the diver- 
gence of mtDNA. First, although the genus diversified recently 
(4-5 Ma; Matthee et al. 2004), over 30 different species are 



currently classified, which colonized the entire globe and are 
well adapted to contrasting environments (Alves and 
Hacklander 2008). These different habitats encompass envi- 
ronmental pressures that have been suggested to shape adap- 
tive mtDNA evolution (e.g., Ballard and Whitlock 2004), as 
different species of hares inhabit for instance arid (e.g., 
Lepus capensis), temperate (e.g., L. granatensis), or arctic re- 
gions (e.g., L. timidus). Second, interspecific mtDNA introgres- 
sion is a common phenomenon among species of hares 
(reviewed by Alves et al. 2008). It has been described both 
in current hybrid zones (e.g., Thulin et al. 2006) and in areas of 
ancient contact between species (Melo-Ferreira et al. 2005). 
The mountain hare, L. timidus, a boreal/arctic species widely 
distributed all over the northern Palearctic, has often been 
found to be the donor, and introgression of mtDNA from 
this species may affect the northern part of the ranges of 
many temperate species distributed across Eurasia, from the 
Iberian Peninsula to China (Alves et al. 2008). Considering 
cases where mtDNA introgression has been demonstrated 
(Melo-Ferreira et al. 2012) or suspected (Alves et al. 2008), a 
dozen different species may have captured mtDNA from 
L. timidus through hybridization. Repeated hybridization 
during the competitive replacement of L. timidus by temperate 
species during deglaciation after the last glacial maximum may 
explain the ubiquity of mtDNA introgression of timidus origin, 
simply due to drift in the front of the wave of expansion 
(Melo-Ferreira et al. 2007). However, recurrent introgression 
of this mtDNA type may also be caused by natural selection 
(Melo-Ferreira et al. 201 1), presumably driven by adaptation 
to cold. The adaptive introgression hypothesis would receive 
credit if one could show that adaptation has contributed to 
the divergence of the donor and receiver species. 

In this study, we analyze 1 1 complete mtDNA sequences 
from hares (1 0 newly obtained), including arctic and nonarctic 
lineages, and assess whether the pattern of codon evolution 
along the phylogeny uncovers events of positive selection. 
Second, we evaluate whether the differences detected 
among lineages imply functional changes of the encoded pep- 
tides. Finally, we characterize mtDNA structure and composi- 
tion of the different lineages. We hypothesize that positive 
selection may have governed the evolution of mtDNA in 
hares, particularly in the arctic lineages. We find evidence of 
positive selection along these lineages and discuss the origins 
of the inferred adaptations. 

Materials and Methods 

Sampling and Sequencing of the Complete mtDNA 

Tissue samples of a total of ten specimens from nine species of 
hares were used in this study (table 1). This sampling was 
designed to span the known phylogenetic tree of hares 
(Melo-Ferreira et al. 2012) and to include native mtDNA of 
species that are well adapted to arctic {L. arcticus, L. othus, 
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Table 1 

Specimens Used in TInis Study (All Individuals Except "eur" Were Newly Sequenced) 



Species 


Conde 


Locality 


Sequence Length (bp) 


GenBank Accession Numbers 


Lepus timidus 


tim 


Finland 


17,755 


KJ397605 


L. corsicanus 


cor 


Corsica, France 


17,056 


KJ397606 


L. arcticus 


arc 


Northwest territories, Canada 


16,868 


KJ397607 


L. othus 


oth 


Alaska, USA 


17,288 


KJ397608 


L. townsendii 


tow 


Wyoming, USA 


17,732 


KJ397609 


L. granatensis 


grajnt 


Leon, Spain 


16,916 


KJ397610 


L. granatensis 


gra_nat 


Huelva, Spain 


17,765 


KJ397611 


L. capensis 


cap 


Mauritania 


16,887 


KJ397612 


L. americanus 


ame 


Montana, USA 


17,042 


KJ397613 


L. californicus 


cal 


Texas, USA 


16,938 


KJ397614 


L. europaeus 


eur 


Skane, Sweden 


17,734 


NC_004028 



and L. timidus), boreal (L. americanus), and temperate 
{L califomicus, L. capensis, and L. granatensis) environments, 
and haplotypes introgressed from L. timidus into other species 
{L. granatensis, L. corsicanus, and possibly L. townsendii) (see 
Melo-Ferreira et al. 2012). 

Extraction of total DNA was performed from ear tissues 
using the EasySpin Genomic DNA Tissue Kit, Citomed, or 
liver tissues with the QIAGEN DNeasy kit, following the man- 
ufacturer's instructions. In L. granatensis, we selected speci- 
mens harboring either L. granatensis or L. timidus mtDNA 
based on the polymerase chain reaction (PCR)-restriction frag- 
ment length polymorphism (RFLP) assay described by Melo- 
Ferreira et al. (2005). The complete mtDNA of the sampled 
specimens was amplified in two overlapping fragments of 
approximately 9,000 bp (specimens tim, tow, and gra_nat in 
table 1) or three overlapping fragments of approximately 
6,000 bp using the primers indicated in supplementary table 
SI, Supplementary Material online, and the long range 
Phusion Flash High-Fidelity PGR Master Mix (Finnzymes) pro- 
tocol. This long-range PGR approach allowed minimizing the 
probability of amplifying nuclear copies of mtDNA sequences 
(NUMTs), because these copies tend to be shorter (Galvignac 
et al. 201 1). Final PGR mix included 6|il of Master Mix and 
0.6 |il of each primer at 20 ^M, in a final volume of 8.6 The 
following PGR cycle conditions were used: 10s at 98 °G; 30 
cycles with 1 s at 98 °G for the denaturation, 5 s for the an- 
nealing, and 4min at 72 °G for the extension, and 1 min at 
72 °G for the final extension step. 

For samples tim, tow, and gra_nat (table 1 ), nested PGRs of 
overlapping portions of approximately 1,200 bp were per- 
formed to increase amounts of amplified DNA (see primers 
in supplementary table SI, Supplementary Material online). 
Nested PGRs were performed in a final mix volume of 9.5 |il, 
using 3.5 \i\ of a PGR Master Mix (QIAGEN Multiplex PGR Kit) 
and 0.2 |il of each primer at 20 (xM, using the following tem- 
perature cycles: 15 min at 94 °G; 35 cycles of denaturation/ 
annealing/extension with 30s at 94 °G for denaturation, 45 s 
for annealing, and 1.15 min at 72 °G for extension and 20 min 



at 60 °G for the final extension step. Nested PGR products 
were sequenced using the forward and reverse PGR primers 
following the ABI PRISM BigDye Terminator Gycle Sequencing 
3.1 standard protocol and using an ABI PRISM 3130 se- 
quencer (Applied Biosystems). 

The long-range PGR products of the remaining specimens 
(table 1) were sequenced using 454 technology and assem- 
bled (using Mira assembler v3.2.1; Ghevreux et al. 1999) by 
Ecogenics (Switzerland). Tablet v.1.12.03.26 (Milne et al. 
2010) was used to assess Phred quality scores and coverage 
along the obtained contigs. Only sites with quality score above 
20 and coverage above lOx were accepted. Sequenced por- 
tions that did not satisfy these criteria were coded as missing 
data. Those portions with missing data were resequenced 
using the nested PGR strategy described earlier and Sanger 
sequencing, which also allowed verifying the concordance of 
the two sequencing procedures. 

Phylogenetic Analysis 

The complete mtDNA sequences of ten specimens sequenced 
in this work and of L. europaeus (GenBank Acc. Nr. 
NG_004028) were aligned using GlustalW (Thompson et al. 
1 994). For the phylogenetic reconstruction, a sequence of the 
European wild rabbit, Oryctolagus cuniculus, complete 
mtDNA was added to the alignment and used as outgroup 
(GenBank Acc. Nr. NG_001913). 

The sequences of the 13 mtDNA protein-coding genes 
were used for phylogeny estimation. The best fit model of 
evolution for each gene was determined with jModelTest 
vO.1.1 (Posada 2008), applying the Akaike information crite- 
rion. The overlapping regions between ATP8/ATP6, ND4I7 
ND4, and ND5/ND6 were considered as a separate partition. 

Phylogenetic inference was performed using the Bayesian 
methodology implemented in BEAST v1 .7.4(Drummond et al. 
2012), applying the appropriate models determined with 
jModelTest, or the next most parameterized model available 
in BEAST. The uncorrelated lognormal relaxed clock 
(Drummond et al. 2006) and the Yule tree prior were used. 



888 Genome Biol. Evol. 6(4):886-896. doi:10.1093/gbe/evu059 Advance Access publication April 1, 2014 



Elusive Nature of Adaptive mtDNA Evolution 



GBE 



Three independent runs of 100,000,000 Markov chain Monte 
Carlo (MCMC) generations, sampling at every 10,000 gener- 
ations, were performed and the stability of the runs checked 
using Tracer v1.5 (Rambaut and Drummond 2009). The first 
1 0% of each run was discarded as burn-in, the results of the 
three runs were combined using LogCombiner v1 .7.4, and the 
information of the trees summarized using TreeAnnotator 
v1.7.4, both part of the BEAST package (Drummond et al. 
2012). 

Tests of Selection 

The effect of natural selection on the evolution of the mtDNA 
protein-coding genes was assessed by comparing the number 
of nonsynonymous changes per nonsynonymous sites (d^) 
with that of synonymous changes per synonymous site (ds) 
using maximum likelihood and Bayesian approaches. Purifying 
selection is characterized by a d^ds ratio (ra) < 1 , whereas 
values > 1 are indicative of positive selection. The phylogeny 
estimated in this work, unrooted and excluding the outgroup, 
was used in all analyses of selection. Also, all methodologies 
described below were applied to three different alignments: 1 ) 
the 13 protein-coding genes concatenated; 2) protein-coding 
genes concatenated by respiratory complex; and 3) single 
genes. These separate analyses allowed verifying whether nat- 
ural selection affected preferentially different OXPHOS com- 
plexes or genes. Also, the concatenation of genes results in 
larger data sets and higher number of variable codons, which 
may help improve the power of the tests (Yang and dos Reis 
2011). 

First, we assessed whether selection could be detected at 
individual codons, independently of the branch of the phylog- 
eny, using the sites models (Nielsen and Yang 1998; Yang 
et al. 2000) implemented in CODEML, in the PAML 4 package 
(Yang 2007). Different ra starting values were used to avoid 
local optima. Nested null (restricting co to values < 1) and pos- 
itive selection (allowing ra to take values > 1) models were 
compared using a likelihood ratio test (LRT): Mia versus 
M2a, M7 versus M8, and M8a versus MB. 

We further assessed the impact of selection along the 
mtDNA phylogeny of hares using additional codon models 
to estimate the rates of synonymous and nonsynonymous 
substitutions (Pond and Frost 2005c). Four methods imple- 
mented on the DATAMONKEY web server (http://vvww.data- 
monkey.org/ [last accessed April 15, 2014]; Pond and Frost 
2005b), Single Likelihood Ancestral Counting (SLAC), Fixed 
Effects Likelihood (FEL), Random Effects Likelihood (REL), 
and PRoperty Informed Models of Evolution (PRIME), were 
used. After reconstructing ancestral sequences, SLAC compa- 
res normalized expected and observed numbers of synony- 
mous and nonsynonymous substitutions per variable site. 
FEL compares the instantaneous synonymous site rate (a) 
and the instantaneous nonsynonymous site rate (p) on a per 
site basis, without assuming a prior d^ds distribution. In 



PRIME, p also depends on the properties of the amino acid 
variants, analyzing two sets of five amino acid properties 
(Atchley et al. 2005; Conant et al. 2007). REL allows for het- 
erogeneity both in synonymous and nonsynonymous rates, by 
fitting discrete distributions of rates across sites and then in- 
ferring the rate per site. Sites with cut-off values of P < 0.25 in 
SLAC and FEL (in small data sets, high nominal a-levels in these 
tests are advised; Pond and Frost 2005c) and P<0.05 in 
PRIME and Bayes factors > 50 in REL were considered as can- 
didates to have evolved under positive selection. In all analyses 
performed in DATAMONKEY, the most suited model of evo- 
lution for each data set, directly estimated on this web server, 
was used. 

Branch models (Yang 1998; Yang and Nielsen 1998), 
which allow ra to vary along branches of the phylogeny, 
were also fit to the data. The likelihood of a single ra (null 
model) was tested against the likelihood of a model allowing 
different ra in each branch (full model) using an LRT. In case of 
rejection of the null model, models considering two ra rates 
were tested. These simplified models were based on the hy- 
pothesis that a distinct selection regime could affect the 
mtDNA lineage of species adapted to arctic or boreal condi- 
tions: arctic lineages versus the rest, and arctic lineage and 
L. americanus (hereafter named arctic/boreal group) versus 
the rest. Finally, the GA-branch model selection analysis 
(Pond and Frost 2005a) implemented in DATAMONKEY was 
used to assess which combinations of rates per branches 
allowed the best fit to the data without prior assumptions. 

The methods described earlier are best suited to detect 
events of pervasive and recurrent positive selection because 1) 
site-based methods average ra over the entire phylogeny 
(Yang et al. 2000), whereas 2) branch-based methods average 
ra over all codons in a protein (Yang 1998). These methods 
may thus be complemented with branch-site tests that focus 
on particular residues along specific lineages, which provide 
increased power to detect episodic selection (Yang et al. 
1995; Murrell et al. 2012). Thus, the branch-sites model 
(Yang et al. 2005; Zhang et al. 2005), implemented in 
CODEML, was also used here. The likelihood of a model (A) 
in which site classes in a background lineage are restricted to 
values < 1, while site classes of the foreground lineage are 
allowed to take values > 1, is compared with the likelihood 
of a null model that fixes ra at 1 in the foreground lineage 
using an LRT. This method has been shown to be rather robust 
to low numbers of codons in the alignment (Yang and dos 
Reis 2011). All individual branches were tested as possible 
foreground lineage. In addition, numerous combinations of 
branches were tested as the foreground lineage, among 
which were branches 1-11 (arctic lineage), branches 1-11 
and 1 3 (arctic/boreal group), branches 3 and 5 (evolution of 
lineage introgressed into L. corsicanus), 3 and 4 (evolution of 
the native arctic species, L. timidus, L. othus and L. arcticus), 
and 5 and 8 (evolution of lineage introgressed into L. grana- 
tensis) (see branch numbers in fig. 1). The Bayes empirical 
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L. americanus 



L. californicus 



L. capensis 



L. corsicanus 



n>.7 



L. othus 
arcticus 



0.9 
S 

' L. granatensis (gra_int) 
L. townsendii 



Arctic 
Lineage 



L. europaeus 

L. granatensis (gra nat) 



d^/d3 rates: 



0,041 
0.030 



0.020 
0.009 



Fig. 1. — Unrooted phylogeny of 1 1 hare specimens (codes in paren- 
thesis correspond to those indicated in table 1 ) based on 1 3 protein-coding 
mtDNA sequences, estimated using Bayesian inference. Branch number is 
indicated on the respective branch and posterior probabilities next to the 
nodes. Two North American hares, Lepus americanus and L. californicus 
are the first to split, followed by the African L. capensis and then the 
European L. granatensis (native haplotype) and L. europaeus. Finally, one 
clade named "arctic" includes the sequences from the arctic/boreal 
L. timidus, L. arcticus, and L. othus, and also from L. corsicanus and 
L. granatensis (temperate species affected by ancient mtDNA introgression 
of L timidus origin) (Melo-Ferreira et al. 201 2), and L. townsendii (a North 
American species for which mtDNA introgression from L. timidus has been 
suspected but was never demonstrated) (Alves et al. 2008; Melo-Ferreira 
et al. 2012). Branch thickness and gray shade indicate the inferred dfj/ds 
rates that provide the best fit to the data as estimated by the GA-branch 
model selection scheme in DATAMONKEY. 



Bayes method was used to identify sites under positive selec- 
tion in case of rejection of the null model (Yang et al. 2005). 

In addition, the mixed effects model of evolution (MEME), a 
branch-site method incorporated in the DATAMONKEY 
server, was used to test for both pervasive and episodic diver- 
sifying selection (Murrell et al. 201 2). MEME is a generalization 
of EEL but models variable co across lineages at individual sites, 
restricting to be< 1 in a proportion p of branches and 
unrestricted at a proportion (1-p) of branches per site. 
Positive selection was inferred with this method for a P 
value < 0.05. 

TreeSAAP (Woolley et al. 2003), which takes into account 
the magnitude of the impact of the amino acid replacements 
on local physicochemical properties, was also applied here. 
Radical magnitudes of changes >6, with P value < 0.001, 
were here considered as indicating directional positive selec- 
tion for a given physicochemical property. Also, only amino 
acid properties that had an accuracy of detection of selection 
higher than 85% were considered (McClellan and Ellison 



2010), and therefore, 11 of the 31 properties analyzed by 
TreeSAAP were excluded. 

Analyses of Protein Function and Three-Dimensional 
Structure 

The potential impact of the observed amino acid substitutions 
was assessed considering their location relative to known 
functional domains of the proteins and the physicochemical 
properties of the amino acid, such as size and charge. A three- 
dimensional protein homology model was built for subunits, 
whenever appropriate templates were available, and the 
amino acid substitutions tested in silico. Protein models were 
built with PHYRE 2 (http://www.sbg.bio.ic.ac.uk/phyre2, last 
accessed April 7, 2014; Kelley and Sternberg 2009) for the 
following subunits (PDBid of the respective template in paren- 
thesis): ND2 (3RK0), ND4 (3RK0), ND5 (3RK0), COXI (1V54), 
COXII (1V55), COXIII (1V54), and CYTB (3CWB). We have 
mutated each residue in the structure into the alternative 
amino acid, and the result was inspected site by site, verifying 
predictable structural alterations due to steric clashes and in- 
specting possible changes in polarity close to relevant protein 
regions such as proton channels and redox centers. This way it 
was possible to evaluate whether the detected amino acid 
changes are or not potentially neutral from the protein func- 
tion point of view. 

Analysis of mtDNA Composition and Structure 

Sequence annotation was performed by using MITOS (Bernt 
et al. 2012) to identify protein-coding genes and rRNAs and 
ARWEN (Laslett and Canback 2008) to identify tRNA genes. 
Gene limits were checked by comparison with orthologous 
mtDNA sequences of published Leporidae species (L euro- 
paeus and 0. cuniculus) and by performing sequence similarity 
in National Center for Biotechnology Information using 
BLASTX (Altschul et al. 1997) against the nonredundant pro- 
tein sequences database. Analyses of variation in start and 
stop codon usage of protein-coding genes and in limits of 
overlapping genes were performed with in-house Perl scripts. 
The origin of light-strand replication was manually identified 
and the minimum free energy of its secondary structure was 
calculated by using the Vienna RNA Websuite, DNA energy 
parameters (Gruber et al. 2008). Pseudo-tRNAs or cloverleaf 
secondary structures — within the major mtDNA noncoding 
region (the control region or D-loop) — were identified using 
ARWEN. 

Results 

Complete mtDNA Sequence Data and 
Phylogenetic Inference 

Sequences of the nearly complete mitogenomes of ten speci- 
mens of hares were newly obtained in this study. Total sizes 
vary from 16,868 bp in L. arcticus to 17,765 bp in native 
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L. granatensis (table 1 , where GenBank accession numbers are 
shown). Some portions of the control region were incomplete, 
and the serine tRNA was incomplete in L arcticus and 
L. californicus. No differences were detected whenever por- 
tions of the mitogenomes were sequenced both with Sanger 
and 454 strategies (total of 19, 176 bp resequenced). The 
long-range PCR approach employed prior to sequencing is 
expected to avoid the amplification of NUMTs. In keeping, 
no protein-coding genes displayed premature stop codon or 
disruption of the reading frame. Also, sequences of the 
overlapping PCR fragments were identical, which would be 
unexpected if some fragment was of nuclear origin. 

The unrooted phylogenetic tree displaying the evolutionary 
relationships inferred for the mtDNA between the 1 1 hare 
specimens analyzed in this study (10 newly sequenced) esti- 
mated using BEAST v1 .7.4 (Drummond et al. 201 2) is depicted 
in figure 1 (see rooted version in supplementary fig. SI, 
Supplementary Material online; the evolutionary models 
selected for each sequence partition are shown in supplemen- 
tary table S2, Supplementary Material online). The obtained 



topology agrees with previous inferences based on shorter 
mtDNA fragments (Melo-Ferreira et al. 2012). 

Evidence of Natural Selection 

We applied several methods designed to detect evidence of 
positive selection affecting mtDNA protein-coding genes 
throughout the phylogenetic tree of hares. Only the sites 
model implemented in CODEML (Nielsen and Yang 1998; 
Yang et al. 2000) did not provide evidence of positive se- 
lection. The suggestions of positive selection of the several 
site-specific analyses are summarized in table 2, where only 
sites with more than one method suggesting evolution under 
positive selection are shown (complete results are in supple- 
mentary table S3, Supplementary Material online). The branch 
of the phylogenetic tree where positive selection was inferred 
is indicated whenever a particular method allowed such infer- 
ence — for example, branch-sites model of CODEML (Yang 
et al. 2005; Zhang et al. 2005), MEME (Murrell et al. 2012), 
orTreeSAAP(Woolleyetal. 2003). For the remaining methods 
(REL, FEL, SLAC, and PRIME; Pond and Frost 2005c), only the 



Table 2 

Sites Affected by Positive Selection According to Seven Different Methods (Only Sites with Suggestion of Selection by Two or More Methods Are 
Shown; See Complete Results in supplementary table S3, Supplementary Material online, Where the Amino Acid Variation Is Indicated) 



Gene Site Tests of Selection^ 







SLAC'' 


REL'' 


fel'' 


PRIME 




MEME'' 


Branch Sites'* 


Branch'' 


TreeSAAP 
Magnitude of Change^ 


ATP8 


38 
















(1; 5; 15)9^ 


6 (1; 15) ; Pa; 6 (5) t Pa 




49 




x** 




X*' (P: 


-7.891) 


(3; 4; 5)** 


(3+5)** 






CYTB 


43 








X* {V: 


-6.997) 


(7)^^ 




(7)* 


8 t pK' 


ND1 


49 






x* 














ND2 


92 




x^ 


x* 














ND4 


20 




X* 


x* 
















187 




X* 


x* 










(1; 13)^ 


6 ; Pa 




337 






x* 










(3; 18)^9'' 


8 t pK' 




351 






x* 










(1; 13)** 


8 ; pK' 


ND4L 


6 




x* 


x* 










(2; 13)^9'' 


8 ; pK' 




21 




X* 












(1; 14)^9'' 


8 t pK' 


ND5 


410 




X* 














8 t pK' 




531 




X* 




x9^ (P: 20.000; V: -10.177; 






(5; 11; 13; 15)* 


6 ; Pa 














OpHi: 18.683) 










ND6 


14 




x^ 












(2; 15)91^ 


8 t pK' 




108 




X9h 


x* 















Note.— SLAC REL, FEL {Pond and Frost 2005c); PRIME (Pond and Frost 2005b); branch sites (Yang et al. 2005; Zhang et aL 2005); MEME (Murrell et aL 2012); and 
TreeSAAP (WooNey et aL 2003). 

^Null model rejection thresholds: TreeSAAP, 0.001; SLAC and FEL, 0.25; PRIME, 0.05; MEME 0.05; branch sites, 0.05; Bayes Factor for REL- 50. 
'^In SLAC, REL, and FEL, rejection of null model is indicated by "x." 

'^In PRIME, rejection of null model is indicated by "x," and the property under selection and its weight are indicated in parenthesis. C, charge; pHi, isoelectric point; P, 
polarity; V, volume. 

''in the branch sites, MEME, and TreeSAAP tests, the branch(es) where positive selection was inferred is indicated (see fig. 1 for correspondence); the Bayes empirical 
Bayes (Yang et al. 2005) posterior probabilities of codon detection of the branch sites results are *0.967 (^) and 0.893 (^). 

^Magnitude of amino acid changes inferred with TreeSaap: increase (f) and decrease (J,) in amino acid properties in a given branch number is indicated. Pa: alpha 
helical tendencies; pK': equilibrium constant (ionization of COOH). 

Data sets: Individual mtDNA protein-coding genes, ^mtDNA protein-coding genes concatenated by OXPHOS complex, and ^all mtDNA protein-coding genes 
concatenated. 
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amino acid substitutions are shown in table 2. A total of 18 
different codons in eight genes were here suggested to have 
evolved under recurrent positive selection by REL, PEL, or 
SLAC, of which ten sites were concordant in two of the 
methods (table 2). PRIME suggested that adaptive evolution 
may have affected five sites (indicated by the negative weigh 
of amino acid properties "isoelectric point," "volume," and 
"polarity"; see table 2 and supplementary table S3, 
Supplementary Material online), three of which were also 
pinpointed by other analyses. MEME, which is particularly 
sensitive to events of episodic selection, suggested that six 
sites of five genes evolved under positive selection, which 
may have affected several lineages along the evolution of 
hares. Only two codons (in the ATP8 and ND4 genes) were 
suggested as a focus of positive selection by the branch-sites 
models analysis of CODEML (table 2; supplementary table S3, 
Supplementary Matenal online). TreeSAAP, which incorpo- 
rates the magnitude of the physiochemical impact of the 
amino acid substitutions in the analysis, identified numerous 
codons (103) as possible focus of positive selection. Among 
the properties in which radical changes were suggested, 
"alpha helical tendencies" and "equilibrium constant (ioniza- 
tion of COOH)" were the most represented (supplementary 
table S3, Supplementary Material online). Several codons (1 5), 
in the ATP8, CYTB, and six of the seven NADH dehydrogenase 
complex genes, were repeatedly reported as positively 
selected by more than one methodology (table 2). Most nota- 
bly, codon 49 of ATP8 and codon 531 of ND5 were indicated 
by five of the eight site-specific methods used here and codon 
1 87 of ND4 by four of these methods. 

In the branch model analysis of CODEML (Yang 1998; 
Yang and Nielsen 1998), the null model of a single co along 
the phylogeny was rejected only in the analysis of the 
complete mtDNA protein-coding genes (P<0.05). 
The model considering two co rates along the phylogeny — in 
the arctic/boreal group (arctic lineage and L. americanus) and 
elsewhere — was significantly favored over the null model 
(P<0.05). The model selection procedure employed using 
GA-branch suggested that the model that optimizes different 
CO rates along the tree is composed of four different rates 
(fig. 1). In any case, the inferred cb rate was consistently well 
below 1 in all estimates obtained. 

Little Impact of Mutations on Protein Structure 
and Function 

The predicted impact of amino acid substitutions on the 
protein function was evaluated for subunits with suitable 
homologs with available protein structure (see supplemen- 
tary table S4, Supplementary Material online). This analysis 
suggested all substitutions to be neutral from a functional 
point of view, both regarding potential local changes in 
structure arising from steric clashes resulting from changes 
in size, and the changes in polarity and charge nearby the 
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Fig. 2. — Graphical representation of Lepus granatensis (gra_int) 
introgressed mtDNA control region cloverleaf structure (at positions 
16224-16303). Nucleotide complementarities are represented by "!" 
or whereas nucleotide mismatches are depicted by "+." 



redox centers and putative proton channels, where even 
small changes may have a large impact. 

mtDNA Composition 

All mitogenomes sequenced have the same gene arrange- 
ment and content typically found in mammalian mtDNAs 
(supplementary fig. S2, Supplementary Material online). No 
variation was found in start codons of protein-coding genes, 
but different stop codons (TAA and TAG) were identified 
within four genes: C0X1 , C0X2, ND5, and ND5. This variation 
should not have functional consequences, as different mam- 
malian mtDNA stop codons are not expected to significantly 
change gene expression. 

A region inside the major noncoding portion of the mtDNA 
was predicted to form stable stem-and-loop secondary struc- 
tures with minimum free energy prediction varying between 
-123.60 and -49.74 kcal/mol. One or more copies of a con- 
served cloverleaf structure (pseudo-tRNA; fig. 2) in this region 
were inferred only in the haplotypes from the arctic species 
(L arcticus: three copies, L. othus: three copies, and L. timidus: 
one copy), L. granatensis carrying the introgressed L. timidus 
haplotype (five copies), L. corsicanus (four copies), and 
L townsendii (two copies). When two or more cloverleaf 
structures were present, they partially overlapped. 

Discussion 

mtDNA may play an important role in the adaptation to par- 
ticular habitats, and groups of species that live in contrasting 
ecological conditions are valuable models to assess this role. 
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We used here a combination of phylogenetic analyses of 
codon evolution and theoretical inferences of the impact of 
amino acid substitutions on protein structure and function to 
understand whether natural selection governed the evolution 
of mtDNA in hares, and to provide insights on the origins of 
the adaptation. 

Nonneutral Evolution of mtDNA in Hares 

Estimates of co rates along the branches of hares' mtDNA 
phylogeny were consistently below 1 , which is indicative of 
pervasive purifying selection affecting all mtDNA protein-cod- 
ing genes (fig. 1 ). This result is not surprising because purifying 
selection has been shown to be a predominant force shaping 
mitogenomic evolution (Ruiz-Pesini et al. 2004; Bazin et al. 
2006; Meiklejohn et al. 2007; Rutledge et al. 2010; Sun 
et al. 2011). However, we found the co rates to vary along 
the tree. Considering either a simplified two-rate model 
(arctic/boreal group and rest of the tree; supplementary 
table S5, Supplementary Material online) or the four-rate 
model (based on the GA-branch model selection procedure; 
fig. 1), higher rates were generally estimated in branches of 
the arctic clade and of L. americanus, a boreal North American 
species also well adapted to cold climates (fig. 1). Note, how- 
ever, that branches represented by a single genome may not 
be representative of the respective evolutionary lineage, and 
inferences of natural selection must, in these cases, be looked 
with caution. Branch 3, which represents a subset of the arctic 
clade, was the sole interior branch where the higher class co 
was estimated. Increased co could result from adaptive evolu- 
tion at some sites, in the face of the dominant purifying se- 
lection regime, or from relaxation of purifying selection in 
some lineages (Bazin et al. 2006; Meiklejohn et al. 2007). 
The latter phenomenon should, however, affect all genes 
alike (Tomasco and Lessa 2011), and the rate variation we 
find among genes in that respect may thus result from 
branch- and gene-specific positive selection (supplementary 
tables S5 and S6, Supplementary Material online). 

Purifying and positive selection may occur at a given site at 
different times during the evolution of a group (Crandall et al. 
1999; Chen and Sun 201 1), and such situations may compli- 
cate inferences of positive selection due to a decreased sensi- 
tivity of the tests (Chen and Sun 2011). Also, the history of 
hare radiation is relatively recent (4-5 Ma; Matthee et al. 
2004), and the relatively low levels of divergence among lin- 
eages (average 0.69 nucleotide substitutions per codon over 
the whole hare phylogeny) leads to a reduced power of the 
tests of selection (Anisimova and Yang 2007). Still, site- and 
branch-site-based tests of selection were able to detect vari- 
ous instances of sites with high di/ds, many reported by dif- 
ferent methodologies (1 5 sites; table 2), and thus that possibly 
evolved under positive selection. 

We addressed this question by applying methods that ac- 
count for site- and branch-specific evolution that are better 



suited at detecting episodic selection on a small fraction of 
amino acids (Zhang et al. 2005; Murrell et al. 201 2). We paid 
particular attention to the hypothesis that selection may have 
had preferential impact in the arctic clade. Indeed, many of 
the reports of positive selection involve the arctic clade or sites 
that distinguish this clade from the remaining hares (e.g., 
internal branches 1, 3, and 4; table 2 and supplementary 
table S3, Supplementary Material online). 

Our results suggest that positive selection may have had a 
particular effect on ATP8, CYTB, and several of the NADH 
dehydrogenase genes (table 2). Previous studies have sug- 
gested that selection in these genes may be related with the 
efficiency of energy conversion (ATP8; Shen et al. 2010; 
Zsurka et al. 2010), which may influence adaptation to cold 
climates (Wallace 2007), the establishment of the proton gra- 
dient necessary for ATP production (CYTB; McClellan et al. 
2005), or the efficiency of the proton-pump (Brandt 2006). 
Interestingly, we found that 30 of the sites reported under 
positive selection in this work coincide with other studies 
(compiled in supplementary table S7, Supplementary 
Material online), some of which proposing a relationship 
with climate adaptation (e.g., Zsurka et al. 2010). It is thus 
striking that the site-specific selection reported in ATP8 lied on 
internal branches of the arctic clade (table 2). In some cases, 
the reported mutations lie on introgressed haplotypes 
(branches 5 and 8; table 2 and supplementary table S3, 
Supplementary Material online), but these were represented 
by single individuals in our phylogeny. It is thus difficult to 
assess whether this sampling is representative of the whole 
introgressed haplotypes and if these inferences indicate 
changes in the selection regime induced by introgression. 
The results obtained here do not demonstrate an adaptive 
drive of mtDNA reticulation (suggested e.g., by Alves et al. 
2008; Melo-Ferreira et al. 201 1) but show that positive selec- 
tion may have punctuated the evolution of mtDNA, most 
notably in the lineages of species thriving in arctic regions, 
eventually by adaptation to cold as suggested in other species 
(Fontanillas et al. 2005). This therefore puts forward a possible 
impact of the high frequencies of mtDNA of arctic origin on 
the evolution and adaptation of the temperate species where 
they are found introgressed. 

On the Origin of mtDNA Adaptation in Hares 

The inference of positive selection on the evolution of mtDNA 
in hares, affecting specific sites, some of which associated 
with lineages of species adapted to boreal climates, questions 
the impact of such mutations on the structure and function of 
the encoded proteins. We modeled the protein structure of 
the encoded proteins of the different lineages and attempted 
to predict the impact of amino acid mutations, given their 
location and consequences on the protein structure and its 
physicochemical properties. Even though the TreeSAAP anal- 
ysis suggested that several of the changes could have had a 
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radical impact on the biochemical properties of the amino acid 
in question (table 2), we assessed that their impact on the 
biochemical structure of the encoded proteins should be re- 
sidual. This analysis was based on the current knowledge on 
functional aspects of mtDNA-encoded proteins (which are 
quite conserved from mammals to bacteria) and on available 
structural models (in several cases, including ATP8, the protein 
three-dimensional structures could not be modeled). 
However, mtDNA-encoded proteins are involved in complex 
interactions with nuclear DNA encoded proteins, and there is 
evidence of cytonuclear coevolution (Rand et al. 2004). This 
interaction is not restricted to the OXPHOS pathway but in- 
cludes many other key processes from the transcription and 
translation of mtDNA proteins to the regulation of the com- 
munication between the organelles (Timmis et al. 2004; 
Cannino et al. 2007; Christian and Spremulli 2012; Xu et al. 
2012). Only an integrated modeling of the cytonuclear pro- 
tein-protein interactions could provide an accurate view of 
the impact of mtDNA nonsynonymous changes, and the func- 
tional impact of the amino acid changes may have been 
underestimated by our analysis. 

Analyses of nonprotein-coding portions of the mtDNA may 
provide additional insights on the nature of mitochondrial 
adaptation in hares, because mutations in the regulatory 
regions of mtDNA may have an effect on mtDNA activity 
(Suissa et al. 2009). Although most of the noncoding part of 
hare mtDNA was highly conserved across lineages, a putative 
cloverleaf structure, with variable copy number, was inferred 
in the control region of the arctic lineage, including the intro- 
gressed L. granatensis and L. corsicanus (fig. 2). The copies 
were estimated to reach high negative free energy in its 
secondary structure, which suggests high stability and high 
probability of spontaneous formation. Such unusual structures 
may influence the efficiency of mtDNA replication and 
transcription (Brown et al. 1986; Pereira et al. 2008) and 
thus regulate mtDNA content in cells, which may impact the 
efficiency of mitochondrial function (Suissa et al. 2009), and 
contribute to adaptation (Cheng et al. 2013). The presence of 
this structure only in the arctic lineage and the apparent 
variable copy number is intriguing and raises the interesting 
possibility that it may have contributed to regulatory-driven 
adaptation. This should be the focus of future research. 

Conclusion 

This study provides new insights into the adaptive nature of 
mtDNA evolution, which may have facilitated the colonization 
of hares of diverse and contrasting environments. Still, the 
origin of the inferred positive selection remains elusive, given 
the small predicted impact of the observed amino acid substi- 
tutions in protein function. We have, however, only evaluated 
this impact in the mtDNA encoded proteins, which are part of 
complex networks with a far greater number of proteins 
encoded by nuclear DNA. The frequent occurrence of 



mtDNA introgression among hares provides natural mtDNA 
replacement experiments that are valuable to investigate how 
coevolution and gene regulation in these networks contrib- 
uted to mtDNA evolution. Two major questions are raised by 
the occurrence of these massive interspecific mtDNA transfers: 
1) was introgression favored by natural selection? 2) and if 
not, could it have a secondary impact on adaptation and 
nucleocytoplasmic coevolution? This work reinforces the 
need to address such questions. 

Supplementary Material 

Supplementary tables S1-S7 and figures SI and S2 are avail- 
able at Genome Biology and Evolution online (http://wvvw. 
gbe.oxfordjournals.org/). 
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